Conversation

@AkhileshNegi AkhileshNegi commented Feb 8, 2026

Summary

Target issue is #493

Checklist

Before submitting a pull request, please ensure that you complete these tasks:

  • Ran fastapi run --reload app/main.py or docker compose up in the repository root and tested.
  • If you've fixed a bug or added code, ensured it is covered by test cases.

Notes

  • Restructured evaluation processing to aggregate pending evaluations by project
  • Updated cron endpoint response format to display per-run details instead of organization-level summaries
  • Improved error handling and failure tracking per project


coderabbitai bot commented Feb 8, 2026

📝 Walkthrough

The evaluation cron processing system is refactored from organization-centric grouping to project-centric grouping. The polling function signature simplifies by removing the organization ID parameter, processing all pending evaluation runs across all organizations and grouping them by project. Response structures are updated to reflect per-run details instead of per-organization summaries.
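The project-centric grouping the walkthrough describes can be sketched roughly like this (illustrative only: the function name and run attributes such as project_id are assumptions, not the repository's actual code):

```python
from collections import defaultdict
from typing import Any


def group_runs_by_project(pending_runs: list[Any]) -> dict[Any, list[Any]]:
    """Group all pending evaluation runs by their project_id.

    Assumes each run exposes a `project_id` attribute; the real
    models in this PR may differ.
    """
    runs_by_project: dict[Any, list[Any]] = defaultdict(list)
    for run in pending_runs:
        runs_by_project[run.project_id].append(run)
    return dict(runs_by_project)
```

Each resulting bucket can then be processed with its own per-project OpenAI/Langfuse clients, as the walkthrough notes.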

Changes

Cron Route Endpoint (backend/app/api/routes/cron.py)
  Removed auth-related imports (AuthContextDep, User), updated the docstring to reflect project-based processing, and modified the response structure to remove the organizations_processed field while retaining total_processed, total_failed, and total_still_processing.

Cron Processing Logic (backend/app/crud/evaluations/cron.py, backend/app/crud/evaluations/processing.py)
  Refactored from per-organization looping to a unified delegation model. The poll_all_pending_evaluations signature changed from (session, org_id) to (session). Added project-based grouping with per-project OpenAI/Langfuse client initialization. Introduced a synchronous wrapper function, process_all_pending_evaluations_sync. Updated error handling to be project-scoped rather than organization-scoped.

Test Updates (backend/app/tests/api/routes/test_cron.py, backend/app/tests/crud/evaluations/test_processing.py)
  Updated cron endpoint tests to expect run-level fields (run_id, run_name, action) instead of org-level fields. Removed the org_id argument from poll_all_pending_evaluations test calls. Updated assertions to validate total_processed, total_failed, and total_still_processing without organization-specific expectations.

Sequence Diagram(s)

sequenceDiagram
    participant Client
    participant CronRoute as Cron Route<br/>(cron.py)
    participant ProcessCron as Process Cron<br/>(cron.py)
    participant PollFunc as Poll Function<br/>(processing.py)
    participant Database as Database<br/>Session
    participant ProjectGroup as Project<br/>Grouping

    Client->>CronRoute: GET /cron/evaluation_jobs
    CronRoute->>ProcessCron: process_all_pending_evaluations_sync(session)
    ProcessCron->>PollFunc: await poll_all_pending_evaluations(session)
    PollFunc->>Database: Fetch all pending evaluation runs
    Database-->>PollFunc: List of pending runs
    PollFunc->>ProjectGroup: Group runs by project_id
    ProjectGroup-->>PollFunc: Dict[project_id, List[runs]]
    loop For each project
        PollFunc->>PollFunc: Initialize OpenAI/Langfuse<br/>clients per project
        PollFunc->>PollFunc: Process all runs<br/>in project
        PollFunc->>Database: Update run statuses/results
    end
    PollFunc-->>ProcessCron: Summary dict<br/>(total_processed, total_failed,<br/>total_still_processing, details)
    ProcessCron-->>CronRoute: Processed response
    CronRoute-->>Client: Response with run-level<br/>details (no org grouping)

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related PRs

  • Kaapi v1.0: Enhancing the test suite #488: Extensive test additions and updates for evaluation processing and cron endpoint that directly align with the signature and behavior changes in poll_all_pending_evaluations and process_all_pending_evaluations functions.

Suggested labels

enhancement, ready-for-review

Suggested reviewers

  • Prajna1999

Poem

Hop along with projects now so bright, 🐰✨
No more orgs to group in sight,
Every run polled far and wide,
Grouped by project, full of pride!

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 inconclusive)

  • Title check: ❓ Inconclusive. The title 'Evaluation: Refactor cron' is vague and lacks specificity about what refactoring was performed or why it matters; it uses generic terminology without conveying the actual change. Consider a more descriptive title that explains the key change, such as 'Evaluation: Refactor cron to group processing by project instead of organization' or 'Evaluation: Consolidate evaluation polling into single query'.

✅ Passed checks (2 passed)

  • Docstring Coverage: ✅ Passed. Docstring coverage is 100.00%, which is sufficient; the required threshold is 80.00%.
  • Description Check: ✅ Passed. Check skipped because CodeRabbit's high-level summary is enabled.



codecov bot commented Feb 8, 2026

Codecov Report

❌ Patch coverage is 90.90909% with 1 line in your changes missing coverage. Please review.

Files with missing lines:

  • backend/app/crud/evaluations/cron.py: 50.00% patch coverage, 1 line missing ⚠️


@AkhileshNegi AkhileshNegi marked this pull request as ready for review February 9, 2026 04:09
@AkhileshNegi AkhileshNegi linked an issue Feb 9, 2026 that may be closed by this pull request

@coderabbitai coderabbitai bot left a comment


Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
backend/app/crud/evaluations/processing.py (1)

750-755: ⚠️ Potential issue | 🟡 Minor

"embeddings_completed" and "embeddings_failed" are counted as still_processing.

check_and_process_evaluation can return actions "embeddings_completed" and "embeddings_failed" (lines 515, 536), both of which leave the eval run in "completed" status. Because they don't match "processed" or "failed", they fall into the else branch and increment total_still_processing_count, which is misleading in the summary.

Proposed fix
                     if result["action"] == "processed":
                         total_processed_count += 1
-                    elif result["action"] == "failed":
+                    elif result["action"] in ("failed", "embeddings_failed"):
                         total_failed_count += 1
+                    elif result["action"] == "embeddings_completed":
+                        total_processed_count += 1
                     else:
                         total_still_processing_count += 1
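Extracted into a standalone helper, the corrected tally might look like this (a sketch using the action strings named above; summarize_actions is a hypothetical name, not a function in this PR):

```python
def summarize_actions(actions: list[str]) -> dict[str, int]:
    """Tally per-run actions into the cron summary counters.

    Counts "embeddings_completed" as processed and "embeddings_failed"
    as failed, so runs that already reached a terminal state are no
    longer misreported as still processing.
    """
    processed = failed = still_processing = 0
    for action in actions:
        if action in ("processed", "embeddings_completed"):
            processed += 1
        elif action in ("failed", "embeddings_failed"):
            failed += 1
        else:
            still_processing += 1
    return {
        "total_processed": processed,
        "total_failed": failed,
        "total_still_processing": still_processing,
    }
```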
backend/app/api/routes/cron.py (1)

54-60: ⚠️ Potential issue | 🟡 Minor

Error response is missing "results" key, inconsistent with the success shape.

The success path (line 47) returns a dict that always contains "results" (set by process_all_pending_evaluations in cron.py). The route-level error handler here omits it, which may break callers expecting a uniform schema.

Proposed fix
         return {
             "status": "error",
             "error": str(e),
             "total_processed": 0,
             "total_failed": 0,
             "total_still_processing": 0,
+            "results": [],
         }
🧹 Nitpick comments (3)
backend/app/crud/evaluations/processing.py (1)

699-701: Deriving org_id from the first run is safe given the data model, but fragile if the assumption changes.

org_id = project_runs[0].organization_id relies on all runs sharing the same org for a given project. This holds because Project belongs to a single Organization, but consider adding a brief comment explaining why this is safe (the FK relationship), so future readers don't question it.
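One way to make that invariant explicit is a small helper that documents and guards it (organization_id_for_project is an illustrative name, not code from this PR):

```python
from typing import Any


def organization_id_for_project(project_runs: list[Any]) -> Any:
    """Derive the organization ID for runs grouped under one project.

    Safe because a Project belongs to exactly one Organization (FK
    relationship), so every run grouped under a project shares the
    same organization_id. The assert documents and enforces that.
    """
    org_ids = {run.organization_id for run in project_runs}
    assert len(org_ids) == 1, f"runs span multiple orgs: {org_ids}"
    return project_runs[0].organization_id
```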

backend/app/crud/evaluations/cron.py (1)

66-78: asyncio.run() may conflict with an existing event loop.

asyncio.run() creates a new event loop and fails with RuntimeError if one is already running. While FastAPI runs sync endpoints in a threadpool (no active loop there), this is fragile — if the endpoint is ever changed to async def, or if this wrapper is called from any async context, it will break. Consider making the route handler async def and awaiting process_all_pending_evaluations directly.

Alternative: make the endpoint async and drop the sync wrapper

In backend/app/api/routes/cron.py:

-def evaluation_cron_job(
+async def evaluation_cron_job(
     session: SessionDep,
 ) -> dict:
-        result = process_all_pending_evaluations_sync(session=session)
+        result = await process_all_pending_evaluations(session=session)

Then this sync wrapper can be removed entirely.
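Alternatively, if the sync wrapper is kept, it could detect a running loop explicitly rather than letting asyncio.run() fail with an opaque RuntimeError; a minimal sketch (run_coroutine_sync is illustrative, not the PR's actual wrapper):

```python
import asyncio
from typing import Any, Coroutine


def run_coroutine_sync(coro: Coroutine[Any, Any, Any]) -> Any:
    """Run a coroutine to completion from synchronous code.

    asyncio.run() raises RuntimeError when called from a thread that
    already has a running event loop, so check for that case first
    and fail with a clearer message.
    """
    try:
        asyncio.get_running_loop()
    except RuntimeError:
        # No loop in this thread (the normal threadpool case): safe.
        return asyncio.run(coro)
    raise RuntimeError(
        "run_coroutine_sync() called from a running event loop; "
        "await the coroutine directly instead"
    )
```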

backend/app/api/routes/cron.py (1)

19-21: Return type hint could be more specific.

Per the coding guidelines ("Always add type hints to all function parameters and return values in Python code"), all return values should have type hints. -> dict could be narrowed to -> dict[str, Any] for consistency with the rest of the codebase.

Proposed fix
-def evaluation_cron_job(
-    session: SessionDep,
-) -> dict:
+def evaluation_cron_job(
+    session: SessionDep,
+) -> dict[str, Any]:

You'll also need to add from typing import Any to the imports.

@AkhileshNegi AkhileshNegi self-assigned this Feb 9, 2026
@AkhileshNegi AkhileshNegi added the enhancement New feature or request label Feb 9, 2026
@AkhileshNegi AkhileshNegi merged commit 790048a into main Feb 9, 2026
2 of 3 checks passed
@AkhileshNegi AkhileshNegi deleted the enhancement/evaluation-refactor-cron branch February 9, 2026 05:19
@AkhileshNegi AkhileshNegi mentioned this pull request Feb 9, 2026
14 tasks

Labels

enhancement New feature or request

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Evaluation: Refactoring CRON

2 participants